Search across Different Media: Numeric Data Sets and Text
Abstract
Digital technology encourages the hope of searching across and between different media forms (text, sound, image, numeric data). We describe topic searches in two different media, text files and socio-economic numeric databases, and also transverse searching, whereby retrieved text is used to find topically related numeric data and vice versa. Direct transverse searching across different media is impossible. Descriptive metadata provides enabling infrastructure, but usually requires mappings between different vocabularies and a search term recommender system. Statistical association techniques and natural language processing can help. Searches in socio-economic numeric databases ordinarily require that place and time be specified.
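To make the idea of a search term recommender based on statistical association more concrete, the following is a minimal sketch and not the authors' system. It assumes a small set of text records that have already been indexed with subject headings from a numeric database's vocabulary; the training pairs, the heading names, and the functions `build_associations` and `recommend` are all hypothetical. Each query word votes for headings in proportion to how often it co-occurred with them.

```python
from collections import Counter, defaultdict

# Hypothetical training pairs: free text already indexed with a subject
# heading from the numeric database's vocabulary (illustrative data only).
TRAINING = [
    ("factory jobs lost as car plants close", "Employment: Manufacturing"),
    ("car makers cut jobs at assembly plants", "Employment: Manufacturing"),
    ("wheat and corn harvest prices fall", "Agriculture: Crop Production"),
    ("farm income drops with corn prices", "Agriculture: Crop Production"),
]

def build_associations(pairs):
    """Count, per word, how many records with each heading contain it."""
    term_heading = defaultdict(Counter)
    for text, heading in pairs:
        for term in set(text.lower().split()):
            term_heading[term][heading] += 1
    return term_heading

def recommend(query, term_heading, top_n=3):
    """Rank headings by summing a simple estimate of P(heading | term)."""
    scores = Counter()
    for term in set(query.lower().split()):
        counts = term_heading.get(term)
        if not counts:
            continue  # term never seen in the training text
        total = sum(counts.values())
        for heading, n in counts.items():
            scores[heading] += n / total
    return scores.most_common(top_n)

associations = build_associations(TRAINING)
print(recommend("car plants jobs", associations))
```

With this toy data, a query such as "car plants jobs" ranks the manufacturing employment heading first. A real recommender would need far more training data and a stronger association measure, and a search of a numeric database would still need place and time supplied separately.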
Related Resources
Entry Vocabulary - a Technology to Enhance Digital Search
This paper describes a search technology which enables improved search across diverse genres of digital objects: documents, patents, cross-language retrieval, numeric data and images. The technology leverages human indexing of objects in specialized domains to provide increased accessibility to non-expert searchers. Our approach is to reverse-engineer text categorization to supply mappings fro...
Sentiment Analysis of Surveys using both Numeric Ratings and Text Comments
Surveys are a common approach to data collection, mostly for the purpose of opinion analysis. There are in general two ways of analyzing people’s opinions about items in surveys. One is based on quantitative data collected from surveys using statistical approaches, and this approach has been around for many years. The second, which is referred to as sentiment analysis, is to extract the atti...
The Integration of the World Wide Web and Intranet Data Resources
The explosive growth in the volume of information available on the Web and in enterprise databases continues unabated. Managing these large quantities of information remains a challenge for both government and industry. TRW’s Digital Media Systems Lab has developed a research platform, InfoWeb, that can be described as an “information infrastructure” that provides seamless access to Web search ...
Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval
As the major component of big data, unstructured heterogeneous multimedia content such as text, image, audio, video and 3D is increasing rapidly on the Internet. Users demand a new type of cross-media retrieval where they can search results across various media by submitting a query of any media type. Since the query and the retrieved results can be of different media, how to learn a heterogeneous metric ...
Cross-domain Text Classification with Multiple Domains and Disparate Label Sets
Advances in transfer learning have eased the limitation of traditional supervised machine learning algorithms, which depend on annotated training data to train new models for every new domain. However, several applications encounter scenarios where models need to transfer or adapt across domains when the label sets vary both in the number of labels and in their connotations. Th...